Monitoring Message-Passing Parallel Applications in the Grid with GRM and Mercury Monitor
Abstract
Application monitoring for parallel applications is hardly supported in current grid infrastructures. There is a need to visualize the behavior of a program during its execution and to analyze its performance. This paper describes the GRM application monitoring tool, the Mercury resource and job monitoring infrastructure, and the combination of the two as a grid monitoring tool-set for message-passing parallel applications. Using them, one can observe and analyze the behavior and performance of an application on-line.

1 Application monitoring in the grid

Many parallel applications are used on a single cluster or supercomputer. As users get access to an actual grid, they would like to execute their parallel applications on the grid instead of the local computing resource, for several reasons. Local resources are always limited in size and availability. They might not be capable of executing an application with a given size of input data, so the user has to look for a larger resource. Another limitation is that the local resource is likely to be occupied by other users for days or weeks, while we may need to run our application today.

Current grid implementations already allow us to submit a parallel application to the grid and let it execute on a remote grid resource. However, these systems give no detailed information about the application during its execution beyond its status, such as waiting in a queue or running. If we want to know how the application is working and what its performance is, we find that there is no way to collect such information. Performance analysis tools are not yet available for current grid systems.

The goal of our research has been to provide a monitoring tool that can collect trace information about an instrumented parallel application executed on a remote grid resource.
This goal is different from providing a monitoring tool for metacomputing applications that run on several resources simultaneously, i.e. distributed applications. That goal requires the tool to monitor processes on more than one resource at the same time, which brings additional requirements for the tool. By combining our GRM and Mercury Monitor tools, we have achieved our goal and created an infrastructure that enables the user to collect performance information about ...

The work described in this paper has been supported by the following grants: the EU-DataGrid IST-2000-25182 and EU-GridLab IST-2001-32133 projects, the Hungarian Supergrid OMFB00728/2002 project, the IHM 4671/1/2003 project and the grant OTKA T042459.
Similar resources
Monitoring Message Passing Applications in the Grid with GRM and R-GMA
Although there are several tools for monitoring parallel applications running on clusters and supercomputers, they cannot be used in the grid without modifications. GRM, a message-passing parallel application monitoring tool for clusters, is connected to the infrastructure of R-GMA, the information and monitoring system of the EU-DataGrid project, in order to collect trace information about messa...
A Message-Passing Distributed Memory Parallel Algorithm for a Dual-Code Thin Layer, Parabolized Navier-Stokes Solver
In this study, the results of parallelization of a 3-D dual code (Thin Layer, Parabolized Navier-Stokes solver) for solving supersonic turbulent flow around body and wing-body combinations are presented. As a serial code, TLNS solver is very time consuming and takes a large part of memory due to the iterative and lengthy computations. Also for complicated geometries, an exceeding number of grid...
From Cluster Monitoring to Grid Monitoring Based on GRM
GRM was originally designed and implemented as part of the P-GRADE graphical parallel program development environment running on supercomputers and clusters. In the framework of the biggest European Grid project, the DataGrid, we investigated the possibility of transforming GRM to a grid application monitoring infrastructure. This paper presents the architectural redesign of GRM to become a stan...
Application Monitoring in the Grid with GRM and PROVE
GRM and PROVE were originally designed and implemented as part of the P-GRADE graphical parallel program development environment running on clusters. In the framework of the biggest European Grid project, the DataGrid project, we investigated the possibility of transforming GRM and PROVE to a Grid monitoring infrastructure. This paper presents the results of this work showing how to separate GRM...
Parleda: a Library for Parallel Processing in Computational Geometry Applications
ParLeda is a software library that provides the basic primitives needed for parallel implementation of computational geometry applications. It can also be used in implementing a parallel application that uses geometric data structures. The parallel model that we use is based on a new heterogeneous parallel model named HBSP, which is based on BSP and is introduced here. ParLeda uses two main lib...